Skip to content

[Enhancement] Classify unsupported-feature errors as client errors (4xx) on the SQL path#5569

Merged
RyanL1997 merged 4 commits into
opensearch-project:mainfrom
RyanL1997:fix/sql-unsupported-feature-4xx
Jun 19, 2026
Merged

[Enhancement] Classify unsupported-feature errors as client errors (4xx) on the SQL path#5569
RyanL1997 merged 4 commits into
opensearch-project:mainfrom
RyanL1997:fix/sql-unsupported-feature-4xx

Conversation

@RyanL1997

@RyanL1997 RyanL1997 commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator

Description

On the analytics-engine (Calcite) route, requests for features the engine does not implement leak to clients as HTTP 500 "internal problem at backend" — for example a SQL vectorSearch(...) table function (CalciteUnsupportedException: Table function is unsupported in Calcite). A 500 wrongly signals a server fault (prompting client retries) and pollutes the server-error metric / ops alarms, when the truth is "the client asked for something this engine does not support" — a client error (4xx).

The PPL path (RestPPLQueryAction) already classifies these as 400; the SQL path (RestSqlAction) was missing the equivalent, so the two engines were inconsistent for the same exception (e.g. PPL ad/kmeans → 400, SQL vectorSearch() → 500).

This PR makes unsupported-feature errors return a clean 4xx on the SQL path, consistent with PPL, and leverages the existing ErrorReport/ErrorCode framework so the classification is structured rather than purely exception-type based.

Changes

  • RestSqlAction.isClientError(): add QueryEngineException (the parent of CalciteUnsupportedException), mirroring RestPPLQueryAction.
  • RestSqlAction / RestPPLQueryAction getRawErrorCode(): honor an ErrorReport's structured ErrorCode (UNSUPPORTED_OPERATION, FIELD_NOT_FOUND, PERMISSION_DENIED, …) when mapping to an HTTP status, instead of always unwrapping to the cause. Backend/unknown codes fall back to the existing cause-classification, so nothing regresses. (This is the ErrorCode-based classification the PPL path's existing TODO anticipated.)
  • CalciteRelNodeVisitor.visitTableFunction(): surface the unsupported table function as a structured ErrorReport coded UNSUPPORTED_OPERATION with a user-facing suggestion, preserving the CalciteUnsupportedException as the cause so QueryService's v2-fallback detection still recognizes it.

Related Issues

N/A

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

… path

On the analytics-engine (Calcite) route, requests for features the engine does
not implement leaked to clients as HTTP 500 "internal problem at backend":

  - SQL `vectorSearch(...)` (a table function) -> CalciteUnsupportedException
  - other CalciteUnsupportedException cases on the SQL `_sql` endpoint

These are *client* errors ("you asked for something this engine does not
support"), not backend faults. Returning 500 wrongly signals a server fault
(prompting client retries) and pollutes the server-error metric / ops alarms.
The PPL path (RestPPLQueryAction) already classifies these as 400; the SQL path
(RestSqlAction) was missing the equivalent, so the two engines were
inconsistent for the same exception.

Changes:
  - RestSqlAction.isClientError(): add QueryEngineException (parent of
    CalciteUnsupportedException), mirroring RestPPLQueryAction so 4xx is
    consistent across SQL and PPL.
  - RestSqlAction / RestPPLQueryAction getRawErrorCode(): honor an
    ErrorReport's structured ErrorCode (UNSUPPORTED_OPERATION, FIELD_NOT_FOUND,
    PERMISSION_DENIED, ...) when mapping to HTTP status, instead of always
    unwrapping to the cause. Backend/unknown codes fall back to the existing
    cause-classification, so nothing regresses. This is the ErrorCode-based
    classification the PPL path's TODO anticipated.
  - CalciteRelNodeVisitor.visitTableFunction(): surface the unsupported table
    function as a structured ErrorReport coded UNSUPPORTED_OPERATION with a
    user suggestion, preserving the CalciteUnsupportedException as the cause so
    QueryService's v2-fallback detection still recognizes it.

Tests:
  - RestSqlActionErrorStatusTest, RestPPLQueryActionErrorStatusTest: status
    mapping for QueryEngineException, ErrorReport codes, fallback-to-cause, and
    genuine 500s.
  - CalciteRelNodeVisitorSearchSimpleTest: visitTableFunction throws an
    ErrorReport(UNSUPPORTED_OPERATION) preserving the CalciteUnsupportedException
    cause.
Signed-off-by: Jialiang Liang <jiallian@amazon.com>
@RyanL1997 RyanL1997 added enhancement New feature or request error-experience Issues related to how we handle failure cases in the plugin. SQL labels Jun 19, 2026
@RyanL1997 RyanL1997 changed the title Classify unsupported-feature errors as client errors (4xx) on the SQL path [Enhancement] Classify unsupported-feature errors as client errors (4xx) on the SQL path Jun 19, 2026
Signed-off-by: Jialiang Liang <jiallian@amazon.com>
The earlier ErrorCode->HTTP switch was too broad: it mapped INDEX_NOT_FOUND->400
and PERMISSION_DENIED->403, which changed the existing index-not-found (404)
contract that doctests assert (and that we agreed to leave to the team).

Revert getRawErrorCode to upstream (unwrap ErrorReport to cause) on both paths,
and revert RestPPLQueryAction entirely (the PPL path already classifies
QueryEngineException as a client error). The fix is now just:
  - RestSqlAction.isClientError() += QueryEngineException (mirrors PPL)
  - visitTableFunction throws ErrorReport(UNSUPPORTED_OPERATION) wrapping
    CalciteUnsupportedException, which maps to 400 via the existing cause-unwrap.

Tests updated to assert index-not-found stays 404 and only unsupported-feature
errors become 400.

Signed-off-by: Jialiang Liang <jiallian@amazon.com>

@dai-chen dai-chen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check if we can simply add Calcite unsupported exception to the exception catch in UnifiedQueryPlanner. If it works, we can catch all such cases.

Move the fix into the planning boundary (UnifiedQueryPlanner.plan), which
already normalizes Calcite internals into a clean error taxonomy (e.g.
invalidTable -> SemanticCheckException). It previously re-threw
CalciteUnsupportedException raw via the QueryEngineException branch, so the SQL
REST path (which does not list QueryEngineException in isClientError) returned
HTTP 500 for an unsupported feature such as the vectorSearch() table function,
while PPL returned 400.

Convert CalciteUnsupportedException to SemanticCheckException here so every
consumer (SQL/PPL REST, Spark, CLI) classifies it consistently as a client
error (4xx) with no REST-layer change: SemanticCheckException is already a
client error on the SQL path and (as a QueryEngineException subclass) on the
PPL path. The unified-planner path is independent of QueryService's
v2-fallback, so this does not affect fallback behavior.

Test: unsupportedFeatureIsRethrownAsSemanticCheckException in
UnifiedQueryPlannerTest.

Signed-off-by: Jialiang Liang <jiallian@amazon.com>
@RyanL1997

Copy link
Copy Markdown
Collaborator Author

Please check if we can simply add Calcite unsupported exception to the exception catch in UnifiedQueryPlanner. If it works, we can catch all such cases.

Hi @dai-chen, good point.

The current approach fits the semantic check because the codebase already defines SemanticCheckException to include unsupported-feature errors, and because plan()'s existing contract is to funnel all "invalid for this engine" outcomes into it.

@dai-chen dai-chen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

@RyanL1997 RyanL1997 merged commit d249a47 into opensearch-project:main Jun 19, 2026
28 of 32 checks passed
@RyanL1997 RyanL1997 deleted the fix/sql-unsupported-feature-4xx branch June 19, 2026 20:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request error-experience Issues related to how we handle failure cases in the plugin. SQL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants